XCELL is back!

After a one-year intermission caused by lack of editing time
(certainly not by lack of material), XCELL is back, and will
again bring you technical information and timely availability
updates on Xilinx devices and development software.

The XC4000 family now has 
three devices in full production, 
with one more available in sample
quantity. Many new packaging 
and speed options for XC3000 and 
XC4000 devices are now available 
(see page 2). 

Early this year, we acquired 
Plus Logic, and made it the EPLD 
division of Xilinx, dedicated to Pro- 
grammable Logic Devices based 
on EPROM technology. The AND/
OR structure and predictable per- 
formance of these devices are 
popular with designers accus- 
tomed to PAL* devices. The two 
presently available Xilinx devices 
offer interesting advantages in 
speed and density. 

Xilinx development systems 
have made great strides to become 
more powerful and easier to use. 

We have also improved our 
applications support and have 
published the first version of a 
Xilinx Applications Handbook, ap- 
propriately called XAPP. 

Expect more devices, new 
technologies, new and better soft- 
ware, additional interfaces and 
platform support from Xilinx and 
third-party vendors. And expect 
to read about it in XCELL. 

Welcome back! 

Peter Alfke, Editor 
©1992 by Xilinx, Inc. All rights reserved 
Two New Software Tools:
X-BLOX and Xilinx ABEL 


Xilinx has recently introduced 
two new software tools that 
shorten design cycles and increase
productivity. X-BLOX™ and Xilinx 
ABEL software tools provide in- 
telligent, high-level design entry 
capabilities that significantly reduce
the need for gate-level design.

Using X-BLOX tools with existing
design-entry tools, logic can
be designed at the block-diagram 
level. Functional blocks, like 
adders and registers, can be speci- 
fied in a schematic, and incorpo- 
rated directly into the design 
without detailed logic design. The 
X-BLOX synthesis design tool, 
however, is much more than a 
macro library. 

A rule-based expert system 
creates optimized Hard Macros for 
each of the X-BLOX modules in 
the schematic. Data paths of any 
width are automatically accommodated,
and the XC4000 dedicated
carry logic is used wherever 
appropriate. The result is an XNF 
file that can be placed and routed 
just like any other. 

Xilinx ABEL also augments existing
design-entry tools by permitting
functional blocks to be defined
using the industry-standard
ABEL” Hardware Description
Language instead of gates. As
with X-BLOX tools, the resulting 
XNF file can be placed and routed 
using the standard Xilinx tools. 





Targeted primarily at auto- 
mated state-machine design, Xilinx 
ABEL uses One-Hot encoding. 
This technique exploits the large 
number of flip-flops contained in 
LCA devices, while minimizing 
the impact of the CLB fan-in re- 
striction. One-Hot encoding usu- 
ally provides the highest 
performance state machines. 

More detailed descriptions of 
X-BLOX and Xilinx ABEL software 
tools appear later in this issue. 


* PAL is a trademark of AMD.
” ABEL is a trademark of Data I/O Corp. 
Current Software List


The following is a list of the current software revision levels for Xilinx’s development system products, as of June 1, 1992.





XCHECKER-PC1 ver. 1.00                  DS355-PC1 OrCAD VST Pre-Release          XACT 2000/3000 Development System
XCHECKER-WS ver. 1.00                   DS371 Xilinx ABEL ver. 1.01              DS501-PC1 ver. 3.20
DS112 Enhanced Serial Configuration     DS380 X-BLOX ver. 1.01                   DS501-SN2 ver. 3.20 on SUN4
  PROM Programmer ver. 3.20             DS381 Cadence Design Kit ver. 4.00*      DS501-AP1 ver. 3.15 on Apollo
BSZ2:FCLE:SILDS Wers4s10                DS390-PC1 VIEWdraw-LCA ver. 4.13         XACT 2000/3000/4000 Dev. System
ESZOD- SC LVIEWsia er A:1s              DS391-PC1 VIEWlogic Interface            DS502-PC1 ver. 1.21
DS310-PC1 DASH-LCA ver. 4.0               ver. 4.13                              DS502-SN2 ver. 1.21 on SUN4
DS343-AP1 Mentor Interface ver. 4.02    DS396 XEPLD Workview Library ver. 3.10   DS502-AP1 ver. 1.11 on Apollo
* See page 8 for details. 
Similar in purpose to FPGAs, complex erasable programmable
logic devices, commonly referred to as EPLDs, combine the
advantages of LSI (smaller size, lower cost, higher
reliability) with the user’s need to create
application-specific circuits without incurring the cost,
delay, and risk of mask-programmed gate arrays.

Different from FPGAs, the 
EPLD architecture is based on 
programmable logic array 
technology for both the functional 
logic and the interconnect 
structure. Each device contains a 
number of programmable units, 
called Function Blocks, each 
containing nine output macrocells 
driven by a programmable AND/ 
OR array. A programmable 
Universal Interconnect Matrix 
(UIM) routes any device input or
any macrocell output to the input 
of any Function Block, completely 
eliminating the issue of routability. 

This unrestricted program- 
mable interconnect structure, 
combined with the familiar AND/ 
OR logic of the traditional PAL 
architecture, makes EPLDs easy to 
use and easy to understand. 

The delay through a Xilinx EPLD device is not only
predictable, but also constant. Any function that can be
implemented in one pass through the device can run at the
maximum specified device speed, 33 or 40 MHz.

Xilinx EPLDs offer two unique 
advantages over competing EPLD 
devices. 

• The XC7236 and XC7272 contain dedicated high-speed
arithmetic carry logic for efficient implementation of fast
adders, subtractors, accumulators, and comparators. This
overcomes a traditional EPLD weakness.

• The UIM can perform a logic-AND function without
additional delay, which means that complex counters of any
practical length (32 bits in the XC7236, 64 bits in the
XC7272) can run at full speed, even synchronously loadable
up/down counters.

No other programmable technology comes close to this
performance. Traditional PLDs support no more than 16 bits
at full speed, and all channel-routed FPGA devices must
concatenate the carry chain, resulting in slower operation
for longer counters.

The Xilinx EPLD Data Book 
provides detailed information on 
the two available Xilinx EPLDs, 
the XC7236 and the XC7272. 








                                                  XC7236   XC7272
Number of Macrocells                                  36       72
Number of Function Blocks                              4        8
Number of inputs to each Function Block               21       21
Number of product terms per Function Block            57       57
Total number of available product terms              228      456
Maximum number of p-terms available
  per Macrocell logic function                        17       16
Total number of signal pins
  (input, output, I/O) (largest package)              36       72
Maximum number of pins available for input
  (largest package)                                   32       54
Maximum number of pins available for output
  (largest package)                                   34       60
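
The totals in the table follow from the per-Function-Block figures. As a quick consistency check, the sketch below recomputes them from the number of Function Blocks, using the nine macrocells per Function Block stated in the text and the 57 product terms per Function Block from the table. The function and constant names are illustrative only and do not come from any Xilinx tool.

```python
# Figures copied from the article; the names here are illustrative.
MACROCELLS_PER_BLOCK = 9   # "nine output macrocells" per Function Block
PTERMS_PER_BLOCK = 57      # product terms per Function Block

FUNCTION_BLOCKS = {"XC7236": 4, "XC7272": 8}


def derived_totals(function_blocks):
    """Return (macrocells, total product terms) for a device."""
    return (function_blocks * MACROCELLS_PER_BLOCK,
            function_blocks * PTERMS_PER_BLOCK)


for name, blocks in FUNCTION_BLOCKS.items():
    macrocells, pterms = derived_totals(blocks)
    print(f"{name}: {macrocells} macrocells, {pterms} product terms")
```

Both devices reproduce the table entries: 36 macrocells and 228 product terms for the XC7236, 72 and 456 for the XC7272.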





Xilinx EPLD Architecture 


XC7236 and XC7272 


The XC7272 is a design revi- 
sion of the original Plus Logic 
FPGA 2020, while the XC7236 is a 
design revision of the original Plus 
Logic Hiper 2010. The product 
nomenclature was changed to denote
the number of macrocells instead
of the less relevant gate-count
number. As the names imply, the 
XC7236 has 36 macrocells, and the 
XC7272 has 72. The XC7236 is the 
more recent design; it incorpo- 
rates some feature enhancements 
beyond the XC7272. The following 
paragraphs and table describe the 
extended features of the XC7236 
that are supported by the current 
version of the XEPLD translation 
software, version 3.1. 

• The XC7236 has a direct feedback path from the macrocell
output to the OR input of either the same macrocell, or the
neighboring macrocell. This speeds up the direct feedback,
especially in counters and arithmetic circuits, and it saves
UIM connections.

• The XC7236 has one additional FastClock input, for a
total of three.

• The XC7236 has selectable logic polarity on all device
inputs to the UIM and on device outputs, independent of the
feedback.

• The XC7236 allocation of private and shared product terms
between the two OR gates feeding the ALU is more efficient,
and the ALU function is more streamlined. These differences
may result in better functionality, but the software usually
isolates the user from such details.
Benchmark Wars

The proliferation of programmable logic manufacturers and
architectures has created confusion in the user community.
How fast and how dense are these competing devices? How can
I compare Xilinx against Actel, or against Altera?

Some vendors have made benchmark claims not only for their
own devices, but also for their competitors'. This has led
to ridiculously misleading statements: Citing four specific
benchmarks, one competitor recently claimed, in public, a
two-times speed advantage over the XC4005, when in reality,
the XC4005 executes these benchmarks 71% faster. Another
competitor has given lengthy comparisons between
anti-fuse-based FPGAs, SRAM-based FPGAs, and EPLDs. Most of
the “results” were tainted.

Nobody should believe any benchmark claims that are based
on one vendor's evaluation of his competitor. A mixture of
ignorance and marketing enthusiasm will inevitably distort
the results and make them meaningless. Let every
manufacturer demonstrate the performance of his devices;
leave the comparison to the user.

PREPco, an independent company headed by Stan Baker, known
as an editor of Electronic Engineering Times, is
coordinating an effort by all PLD manufacturers (including
Xilinx) to come up with a set of standardized benchmarks.
Each of us will report on our own devices, but all our
claims will be verified by our toughest competitor. These
benchmarks will, therefore, be accurate and trustworthy.
They may not be the answer to every question, and there is
room for improvement, for bigger and perhaps more meaningful
benchmarks. But we have made an historic beginning. Expect
detailed results from PREPco late this year:

Programmable Electronics
Performance Corporation
504 Nino Ave., Los Gatos, CA 95032
Phone: (408) 356-2169
Fax: (408) 356-0195

In the meantime, Xilinx has collected some benchmark data
to explain the performance of typical circuits in three
different Xilinx architectures: XC3000, XC4000, and XC7200
EPLD.

PA

Xilinx Benchmark Data

                                                     XC7200 EPLD   XC3000 FPGA (-150)   XC4000 FPGA (-5)
Benchmark                                            (-25)         Speed      CLBs      Speed      CLBs

16-Bit State-skipping Counter,
  Presettable, non-binary                            na            150 MHz    18        111 MHz    12
16-Bit Binary Counter                  Max Speed     40 MHz        116 MHz    24        111 MHz    17
16-Bit Unidirectional,                 Max Density   40 MHz        20 MHz     16        40 MHz     8
  Loadable Counter                     Max Speed     40 MHz        34 MHz     23        42 MHz     9
16-Bit Up/Down Counter                 Max Density   40 MHz        20 MHz     16        40 MHz     8
                                       Max Speed     40 MHz        30 MHz     27        40 MHz     8
16-Bit Loadable, Up/Down Counter       Max Density   40 MHz        20 MHz     16        30 MHz     16
                                       Max Speed     40 MHz        30 MHz     27        30 MHz     16
16:1 Multiplexer                                     25 ns         16 ns      8         16 ns      5
16-Bit Decode from Input Pad                         25 ns         15 ns      4         8 ns       0
24-Bit Accumulator                                   17 MHz        25 MHz     46        32 MHz     13
Data Path Benchmark                                  40 MHz        60 MHz     16        90 MHz     12
  (32 inputs, 4:1 mux, register, 8 bit shift register)
Timer/Counter Benchmark                              40 MHz        30 MHz     21        40 MHz     21
  (8-bit timer/counter, latch, mux, compare)
State-Machine Benchmark                              40 MHz        -                    44 MHz     13
  (16 states, 40 transitions, 10 inputs, 8 outputs)
Arithmetic Benchmark                                 12 MHz        18 MHz     23        18 MHz     21
  (4x4 multiplier, 8 bit accumulator)
16-Channel, 32-Bit DMA                               na            na                   20 MHz     72

















Notes: 
1. All speeds are worst-case temperature and voltage. 
2. System speeds for slower parts, e.g. XC3000-100, -70, can be approximated by derating appropriately (0.67 for -100, 0.47 for -70).
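
As a rough illustration of note 2, the sketch below applies the quoted derating factors to a speed taken from the XC3000 (-150) column. The function name and the dictionary are assumptions made for this example; the result is only the approximation the note describes, not a data-sheet value.

```python
# Derating factors quoted in note 2, relative to the -150 figures in the table.
DERATING = {"-150": 1.00, "-100": 0.67, "-70": 0.47}


def derated_mhz(speed_at_150_mhz, grade):
    """Approximate system speed for a slower XC3000 speed grade."""
    return speed_at_150_mhz * DERATING[grade]


# Example: the 16-bit loadable counter (Max Speed) is listed at 34 MHz for -150.
for grade in ("-100", "-70"):
    print(f"XC3000{grade}: about {derated_mhz(34, grade):.1f} MHz")
```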





Ten XC4010 Density Benchmarks

No.  Application                                                      XC4010
                                                                      Total Gate Count

1    16-Bit Barrel Shifter or Rotator                                  5,952
     32 CLBs, i.e. 12 circuits per XC4010
     496 gates, all combinatorial, 15.5 gates per CLB

2    24-Bit Accumulator                                               19,239
     12 CLBs, i.e. 33 circuits per XC4010
     583 gates, 168 of them in flip-flops, 48 gates per CLB

3    32-Bit Identity Comparator                                        5,940
     9 CLBs, i.e. 44 circuits per XC4010
     135 gates, all combinatorial, 15 gates per CLB

4    9-Bit Parity Checker                                             11,200
     1 CLB, i.e. 400 circuits per XC4010
     28 gates, all combinatorial, 28 gates per CLB

5    16-Input Multiplexer                                              2,480
     5 CLBs, i.e. 80 circuits per XC4010
     31 gates, all combinatorial, 6 gates per CLB

6    16-Bit Loadable Counter                                          14,000
     8 CLBs, i.e. 50 circuits per XC4010
     280 gates, 112 of them in flip-flops, 35 gates per CLB

7    100-MHz 24-Bit Programmable Divider                              10,000
     16 CLBs, i.e. 25 circuits per XC4010
     400 gates, 180 of them in flip-flops, 25 gates per CLB

8    16 x 8 FIFO                                                      16,800
     14 CLBs, i.e. 28 circuits per XC4010
     600 gates, 48 in FF, 512 in RAM, 42 gates per CLB

9    32 x 8 Shift Register                                            55,296
     11 CLBs, i.e. 36 circuits per XC4010
     conceptually 1536 gates, all in flip-flops, 139 gates per CLB

10   32 x 16 RAM                                                      52,500
     16 CLBs, i.e. 25 circuits per XC4010
     2100 gates, 2048 in RAM, 131 gates per CLB

The source for these gate count values is the LSI Logic Data Book of July 87:

1.  16-input mux = 31 gates x 16 = 496 gates.
2.  16-bit fast adder (LSI page 3-99) = 277 gates x 1.5 = 415, +24 flip-flops x 7 gates = 583 gates.
3.  8-bit comparator (LSI page 3-74) = 30 gates, x4 = 120 + 4-bit comparator, total: 135 gates.
4.  9-bit parity (LSI page 3-163) = 28 gates.
5.  16-input mux = 31 gates.
6.  74161 = 70 gates (LSI page 3-115), x4 = 280 gates.
7.  Conservative estimate: less than 1.5 x #6 for same functionality.
8.  128 latches x 4 gates = 512 gates.
9.  256 flip-flops x 6 gates = 1536 gates.
10. 512 latches x 4 bits = 2048 gates plus some addressing.

PA
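
The total gate counts in the density table can be reproduced from the per-circuit figures: the number of circuits that fit is the device's CLB count divided by the CLBs per circuit, rounded down, and the total is that number times the gates per circuit. The sketch below redoes this arithmetic. It assumes 400 CLBs for the XC4010, which is what the parity-checker row (1 CLB, 400 circuits) implies; the names are illustrative.

```python
XC4010_CLBS = 400  # implied by "1 CLB, i.e. 400 circuits per XC4010"

# (application, CLBs per circuit, gates per circuit, total quoted in the table)
BENCHMARKS = [
    ("16-Bit Barrel Shifter or Rotator", 32, 496, 5952),
    ("24-Bit Accumulator", 12, 583, 19239),
    ("32-Bit Identity Comparator", 9, 135, 5940),
    ("9-Bit Parity Checker", 1, 28, 11200),
    ("16-Input Multiplexer", 5, 31, 2480),
    ("16-Bit Loadable Counter", 8, 280, 14000),
    ("100-MHz 24-Bit Programmable Divider", 16, 400, 10000),
    ("16 x 8 FIFO", 14, 600, 16800),
    ("32 x 8 Shift Register", 11, 1536, 55296),
    ("32 x 16 RAM", 16, 2100, 52500),
]


def total_gate_count(clbs_per_circuit, gates_per_circuit, device_clbs=XC4010_CLBS):
    """Return (circuits that fit, equivalent gate count for the full device)."""
    circuits = device_clbs // clbs_per_circuit
    return circuits, circuits * gates_per_circuit


for name, clbs, gates, quoted in BENCHMARKS:
    circuits, total = total_gate_count(clbs, gates)
    status = "matches" if total == quoted else "differs from"
    print(f"{name}: {circuits} circuits, {total} gates ({status} the table)")
```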
XC3000-150


Xilinx is currently sampling 
XC3000 devices in the new -150 
speed grade. These devices are 
15% to 20% faster than XC3000-125
devices. 

The TILO delay, often used as
an LCA benchmark, is reduced 
from 5.5 to 4.6 ns, with similar 
reductions in the other timing 
specifications. As a result, a 16-bit
loadable counter that operates at
28.5 MHz in the XC3000-125 can
be clocked at 34 MHz in the
XC3000-150, an improvement
of 19%.

In new designs, it is some- 
times possible to trade the addi- 
tional speed against CLB usage. 
For example, in the XC3000-125, a 
16-bit carry-lookahead adder 
settles in 36 ns, while a more com- 
plicated and more costly condi- 
tional-sum adder requires only 
26.5 ns. In the XC3000-150, the set- 
tling time of the simple carry- 
lookahead adder is reduced to 
29.5 ns. This provides most of the 
performance advantage of the 
XC3000-125 conditional-sum 
adder, but requires 25% fewer 
CLBs. 
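
The two comparisons above are simple ratios, worked through in the sketch below using only the figures quoted in this article. The function names are illustrative. The second function expresses "most of the performance advantage" as the fraction of the gap between the two XC3000-125 adders that the faster part closes.

```python
def percent_faster(old_mhz, new_mhz):
    """Clock-rate improvement of the new part over the old one, in percent."""
    return (new_mhz / old_mhz - 1.0) * 100.0


def advantage_recovered(slow_ns, fast_ns, new_ns):
    """Fraction of the slow-to-fast settling-time gap closed by new_ns."""
    return (slow_ns - new_ns) / (slow_ns - fast_ns)


# 16-bit loadable counter: 28.5 MHz (-125) versus 34 MHz (-150).
print(f"Counter speed-up: {percent_faster(28.5, 34.0):.0f}%")

# 16-bit adders: carry-lookahead 36 ns and conditional-sum 26.5 ns in the
# XC3000-125; carry-lookahead 29.5 ns in the XC3000-150.
share = advantage_recovered(36.0, 26.5, 29.5)
print(f"Share of the conditional-sum advantage recovered: {share:.0%}")
```

With these numbers, the simple adder in the faster part closes roughly two-thirds of the gap.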

If you would like to simulate 
your design in an XC3000-150 de- 
vice, a new speeds file is available 
through the Xilinx Technical Bul- 
letin Board. The speeds file is the 
specification data base used by 
simulators when calculating per- 
formance. This file may be down- 
loaded to replace the one currently 
in your system. 

Contact Xilinx for a copy of 
the XC3000-150 Data Sheet. Pro- 
duction quantities of XC3000-150 
devices are expected to be avail- 
able in September ’92. 
State Machines Using Xilinx ABEL


The traditional medium for 
logic design is the schematic dia- 
gram. Sometimes, however, the 
system design process naturally 
leads to equations or truth tables. 
In these cases, conversion to gates 
is an unnecessary extra step. To 
eliminate it, Xilinx has introduced 
the Xilinx ABEL software package. 

This software package permits 
blocks of logic to be defined using 
the ABEL* High-level Design Lan- 
guage. These ABEL definitions can 
be compiled into LCA netlists 
without having to represent the 
logic as gates. An additional ben- 
efit is that Xilinx ABEL software 
optimizes the implementation of 
the logic to fit the LCA architecture. 

State machines are especially 
suitable for definition by equations
or tables. A particular state is en- 
tered if, and only if, the current 
state and the control inputs to the 
state machine meet a pre-deter- 
mined set of criteria. Listing these 
criteria describes the state machine
completely. 

Given the description of the 
state machine, the Xilinx ABEL 
software implements it using a 
technique that is well-suited to the 
LCA architecture. While LCA de- 
vices provide a large number of 
flip-flops, the CLB function gen- 
erators have limited fan-in, and 
this environment favors One-Hot 
Encoded (OHE) state machines. 

In an OHE state machine, also known as a state-per-bit
encoded state machine, one flip-flop is assigned to each
state. While fewer flip-flops could be sufficient if states
are encoded, flip-flops are not normally a critical resource
in LCA designs; the critical resource is CLB
function-generator inputs.





OHE minimizes the complexity of the next-state logic
associated with each flip-flop by spreading the task across
the larger number of flip-flops. With OHE, a flip-flop is
only set when its specific state is entered. Identifying the
current state by a single bit makes it easier to combine the
current state information with the control inputs.

State machines using OHE are 
typically faster than those using 
conventional state encoding. The 
reduced logic complexity results 
in fewer levels of CLBs to define 
the operation of each flip-flop. 
Consequently, logic delays are 
lower, and clock rates can be higher. 
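
The idea can be made concrete with a small behavioral model. The sketch below is not ABEL and not what the Xilinx ABEL software generates. It is a hypothetical four-state controller, with made-up state and input names, written to show how each one-hot flip-flop's next value is an OR of (source-state bit AND condition) terms, and how few distinct signals each flip-flop's logic needs.

```python
from math import ceil, log2

# Hypothetical controller; state and input names are illustrative only.
STATES = ["IDLE", "LOAD", "RUN", "STOP"]

# Each transition: (source state, required input values, destination state).
TRANSITIONS = [
    ("IDLE", {"start": 1}, "LOAD"),
    ("IDLE", {"start": 0}, "IDLE"),
    ("LOAD", {}, "RUN"),
    ("RUN", {"done": 0}, "RUN"),
    ("RUN", {"done": 1}, "STOP"),
    ("STOP", {}, "IDLE"),
]


def next_one_hot(current, inputs):
    """Compute the next flip-flop values; `current` has exactly one bit set."""
    nxt = {s: 0 for s in STATES}
    for src, cond, dst in TRANSITIONS:
        # One AND term per transition, ORed into the destination flip-flop.
        if current[src] and all(inputs[k] == v for k, v in cond.items()):
            nxt[dst] = 1
    return nxt


def fan_in(state):
    """Distinct signals feeding the next-state logic of one flip-flop."""
    signals = set()
    for src, cond, dst in TRANSITIONS:
        if dst == state:
            signals.add(src)       # the source-state flip-flop output
            signals.update(cond)   # the control inputs named in the condition
    return signals


for s in STATES:
    print(f"{s}: next-state logic uses {sorted(fan_in(s))}")
print(f"One-hot flip-flops: {len(STATES)}, "
      f"binary-encoded flip-flops: {ceil(log2(len(STATES)))}")
```

In this example no flip-flop needs more than three signals, at the cost of four flip-flops instead of two. A binary encoding would use fewer flip-flops, but each next-state function would in general have to decode all state bits as well as the control inputs.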

The inexpensive Xilinx ABEL 
software is not limited to state 
machine design. The ABEL lan- 
guage can be used to define any 
logic that is more conveniently 
defined using equations rather 
than gates. 

BN 


* ABEL is a trademark of Data I/O Corp. 
X-BLOX Provides High-Level Schematic Entry

Have you ever sketched a block diagram on the back of an
envelope, and wondered how well it would work? X-BLOX tools
make it easy to find out. With the X-BLOX synthesis design
tools, you can enter your block diagram into the existing
design-entry package, and then have it automatically
implemented in an XC4000 LCA device.

Using X-BLOX, a library of 30 frequently used block-diagram
functions is available to construct designs. This library
contains registers, adders, and counters; even RAMs and ROMs
are provided. If necessary, these functions can be combined
with gate-level logic and any other library elements that
are available.

The X-BLOX library is not just a collection of macros,
however. A rule-based expert system converts each X-BLOX
module in your design into an optimized custom Hard Macro.
These Hard Macros are implemented in a way that best
exploits the LCA architecture.

Wherever possible, the Hard Macros utilize the advanced
features of XC4000 LCA devices. All adder and counter macros
exploit the dedicated carry logic for maximum performance
and minimum CLB count. RAM and ROM macros are automatically
created by the X-BLOX software, and MEMGEN is not required.

Hard Macros improve the performance of the design by
carrying the functional structure of the design into the
implementation phase. CLBs that are common to a particular
function are kept together in the array. This improved
placement enhances both the routability and the performance.

X-BLOX tools make it easy to change designs. Function
modules are interconnected by single-line busses that are
easily modified. Consequently, adding or removing modules is
a simple task.

Bus widths are controlled by a single parameter located
anywhere in a data path. Editing this one parameter not only
changes the bus width throughout the data path, but also
changes the width of all the functional blocks in the path.
Even RAMs automatically change their depth to match the
number of address bits.

High-level design with the X-BLOX design tools shortens the
design cycle without sacrificing performance. Your
engineering productivity is increased; more importantly,
your product goes to market sooner.
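
The single-parameter bus width can be pictured with a small model. The sketch below is only an illustration of the idea, not the X-BLOX data format or its algorithm. The `Module` class, the module names, and the `elaborate` function are all assumptions made for this example. One width value sizes every block on the path, and a RAM addressed by that bus takes a depth of two to the power of the number of address bits.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    kind: str  # "counter", "register", "ram", ... (illustrative categories)


# Hypothetical address path: a counter drives a register that addresses a RAM.
ADDRESS_PATH = [
    Module("addr_counter", "counter"),
    Module("addr_reg", "register"),
    Module("lookup_ram", "ram"),
]


def elaborate(modules, bus_width):
    """Size every module on the path from the one bus-width parameter."""
    sizes = {}
    for m in modules:
        if m.kind == "ram":
            # The RAM depth follows the number of address bits on the bus.
            sizes[m.name] = {"address_bits": bus_width, "depth": 2 ** bus_width}
        else:
            sizes[m.name] = {"bits": bus_width}
    return sizes


# Changing the one parameter resizes the whole path.
for width in (4, 5):
    print(width, elaborate(ADDRESS_PATH, width))
```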

BN 





New Applications Data Base

In late January, we installed a call tracking and problem
resolution database for the Technical Support Hotline.
Instead of having to keep individual notes about phone
calls, our Applications Engineers (AEs) now have quick and
easy access to a central record of customers, technical
questions, and known problems.

When you call the Hotline, our Customer Response Center
(CRC) asks you for your name and then calls up your record.
After your first call, you never again have to tell us the
details of your company address and phone number, the type
of Xilinx software that you have installed, and the computer
it runs on.

Attached to your record are files for each of your previous
calls. Each call record includes information about: when the
call was opened, the Applications Engineer to whom it was
referred, the topic of your question or request, notes about
your questions, whether the record is open or closed, and
when it was closed. The CRC staff can see your entire call
history at once, and refer your call to the Applications
Engineer best able to answer your question.

The AE can browse through your call history and see what
you have discussed with other engineers during previous
calls to the Hotline.

If no AE is available to take your call, the CRC staff
queues the call so that the next available AE can accept it
and call you back. Similarly, if you want to leave a message
for an engineer who is not on phone duty that day, the CRC
staff will queue it to the AE’s message bin.

The database is not only for call tracking; it also
contains records of common problems and their solutions.
Using symptom keywords from your description of the problem,
the AEs can search the database, using the captured
expertise of our entire Applications group.

In the future, we plan to make this part of the database
available to our users. You'll be able to browse through the
Customer Access Database and either download or fax back to
yourself detailed explanations of known problems. Look for
it next year.

DF 
ADI Version 3.20 Now Shipping


Xilinx has just released the latest version of the ADI
software used with XC2000 and XC3000 LCA devices. Many
improvements have been made, with the result that more
demanding designs are now routed automatically.

The new algorithm signifi- 
cantly improves both the density 
and the performance of almost 
any design. Using the new algo- 
rithm, most designs that failed to 
route with the old algorithm are 
now completed automatically. 

When compared using the 
Xilinx rogues' gallery of difficult
designs, ADI version 3.20 com- 
pletely routed more designs than 
before, and provided implemen- 
tations that operated faster. 
Specifically designed to route 
LCA devices, the new router 
outperforms third-party tools. 

The new router uses net delays to direct its operation.
Nets that are routed early have greater access to routing
resources. If nets that are routed later become excessively
slow, previously routed nets can be “ripped up” to
accommodate them. Similarly, previously routed nets can be
modified to accommodate those that cannot be routed with the
remaining resources.

Flagnet and Weightnet constraints
should no longer be necessary
with the new router, and may
even be counter-productive.
Flagged nets are routed first, and 
cannot be ripped up. Conse- 
quently, the resources they use 
cannot be re-allocated by the 
router, and slow or unrouted nets 
may result. 

Along with the new routing algorithm, there is a new
placement algorithm. The new algorithm operates much faster
than the old one, yet provides results that are almost as
good. Selecting the new algorithm with the -Y option reduces
the LCA compilation time, and thereby increases design
productivity.

Support for 3-state busses is 
greatly improved. TBUFs with a 
common 3-state control are automatically
aligned in a column. This
permits a vertical Longline to be 
used for the 3-state control. In ad- 
dition, logic associated with the 
TBUFs is placed close to this col- 
umn, improving both performance 
and routability. 

TBUFs can also be named, 
using the BLKNM attribute. Naming
TBUFs permits them to be referred
to easily in constraints files,
improving the user’s ability to 
floorplan within the CLB array. 

To improve the floorplanning 
of inputs and outputs, IOBs can 
now be constrained to one edge of 
the device, or to half an edge. This 
constraint is often adequate to simplify
PCB layout. However, it has
a much smaller impact on LCA 
routing than locking a signal to a 
specific pin. 





Fully Integrated Design Environment from Cadence 


Under a new OEM agreement,
Cadence Design Systems will in- 
tegrate the Xilinx XACT 
development system into the 
Composer* design environment. 
This combination will provide a 
complete top-down-design envi- 
ronment and the convenience of a 
single software vendor, since the 
complete package will be sold and 
supported by Cadence. 

In addition to entering 
designs as schematics, high-level 
design languages such as 
VERILOG-XL* and VHDL-XL* 
will be available to designers. 
X-BLOX will also be integrated 
into the design environment. 
The top-down-design process
will be timing-driven, with board- 
level timing objectives being 
passed to the FPGA implementa- 
tion software. Timing information 
from the resulting FPGA design 
will then be passed back to the 
board-level model for simulation. 

This closed-loop approach 
permits systems designers greater 
visibility into their designs, and 
greater control over the design 
process. Consequently, system re- 
quirements will be met more reli- 
ably, and in less time. 





All LCA families are to be supported, and FPGA design kits
will be available from Cadence, starting in the third
quarter of '92. Initially, the software will run on Sun
workstations, with other platforms to follow. Valid design
kits will also be available.


* Composer, VERILOG-XL and VHDL-XL 
are trademarks of Cadence Design Systems, 
Inc. 





Enhancements to PPR

Version 1.20 of the XC4000 partition, place and route
software (PPR) is now available. This version offers
designers several new features, aimed at improving
performance and increasing productivity. Partitioning,
placement and routing have been enhanced.

XC4000 designers can now control logic partitioning at the
schematic level. New FMAPs and HMAPs operate similarly to
CLBMAPs in XC3000; the user can specify how gates are
combined into CLB function generators.

In critical paths, the logic partitioning affects the
performance of a design. FMAPs and HMAPs force PPR to use
the partitioning envisioned by the designer. Consequently,
performance is more predictable.

The placement process, which often has the greatest effect
on performance, is initialized by a random seed value. This
component of randomness causes repeated runs of PPR to
result in different placements. The new version of PPR can
automatically evaluate the placements resulting from a
user-defined number of seed values, and choose the best.

A new routing feature also helps ensure that performance
requirements are met. Version 1.20 permits a maximum net
delay to be specified. PPR continues re-routing the design
until the longest net delay is less than this maximum value.

If PPR is unable to meet the specification, after a number
of attempts determined by the user, it tries to meet a
second, less demanding specification. If this requirement is
also unattainable, PPR increases the delay specification in
fixed increments until the requirement can be met, or until
the delay exceeds a worst-delay specification. In the latter
case, PPR leaves unrouted those nets that do not meet the
specification. All these delay specifications and increments
are user definable.

Other features of version 1.20 include better support for
the wide edge decoders, and improved constraint handling.
The new version also provides better support for back
annotation of the schematic.
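
The delay-relaxation schedule described above can be sketched as a short loop. This is only a reading of the article's description, not PPR's actual implementation. The function and parameter names are hypothetical and are not PPR options, `try_route` stands in for a routing attempt, and applying the attempt limit at every relaxation level is an assumption.

```python
def route_with_relaxation(try_route, first_spec, second_spec,
                          increment, worst_spec, attempts):
    """Return the delay specification finally met, or None.

    try_route(max_delay) is a stand-in for one routing attempt; it returns
    True if every net met max_delay. All names here are hypothetical.
    """
    def met(spec):
        # Assumption: the user-defined attempt limit applies at every level.
        return any(try_route(spec) for _ in range(attempts))

    if met(first_spec):
        return first_spec

    spec = second_spec
    while spec <= worst_spec:
        if met(spec):
            return spec
        spec += increment  # relax the requirement by a fixed step

    # Worst-delay specification exceeded: in the article's description, the
    # nets that still miss the specification are left unrouted.
    return None


# Toy stand-in: pretend the design can only be routed at 27 ns or slower.
def toy_router(max_delay_ns):
    return max_delay_ns >= 27


print(route_with_relaxation(toy_router, 20, 24, 2, 30, attempts=3))
```

With the toy router, the loop tries 20 ns, then 24 and 26 ns, and succeeds at 28 ns.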





Control XC4000 Placement 
with Hard Macros 


Hard Macros in XC4000 are often associated with adders and
counters that use the dedicated carry. However, Hard Macros
can improve the performance of most functions that benefit
from controlled placement for efficient routing. To
supplement the XC4000 Hard Macro Library, which contains
many common functions, the program HMGEN lets the user
create his own Hard Macros.

Unlike Soft Macros that define 
only the logic function, a Hard 
Macro contains additional 
information that defines relative 
CLB placement and pin utilization,
and may also include routing 
information. However, it is easy 
for the autorouter to provide 
efficient routing once a good CLB 
placement has been defined. 

Once constructed, Hard Macros are used in schematics like
any other macro. PPR places a Hard Macro in the CLB array,
according to the needs of the other logic.
However, within a Hard Macro, 
relative CLB locations are not 
changed, and the routing 
advantages designed into the 
macro can be exploited by PPR. 

To create a non-arithmetic Hard Macro, the logic schematic
of the macro is entered normally, using FMAPs and HMAPs to
determine the logic partitioning. PPR is then used to create
an LCA file that is edited in the XACT Design Editor (XDE)
to provide the desired placement. Any input and output pads
are deleted, and the design is unrouted. After design rule
checking (DRC), the resulting LCA file can be converted into
a Hard Macro using HMGEN. Arithmetic Hard Macros are best
created by modifying an existing Hard Macro in XDE.

The creation of user-defined 
Hard Macros is not recommended 
for inexperienced designers. 
Besides requiring use of the XDE, 
defining a Hard Macro requires 
that the designer consider the 
impact of using the macro in the 
LCA device. Inputs and outputs 
must be positioned within the 
macro such that they communicate
efficiently with surrounding logic; 
the effect the macro has on overall
routing must also be considered. 
As a further constraint, Hard 
Macros must be rectangular in 
shape; unused resources within the 
rectangle are not available to PPR 
for other logic. 

HMGEN is not part of any 
released Xilinx product but is 
available free of charge from Xilinx. 

BN 
XC4000 Dedicated Carry Logic


XC4000-series CLBs contain dedicated, hard-wired carry
logic to accelerate and condense arithmetic functions such
as adders and counters. Adders achieve carry delays as low
as 750 ps/bit, while utilizing only half a CLB/bit. This is
certainly denser than any other approach, and in most cases,
faster.

The dedicated carry logic uses 
a simple ripple scheme for maxi- 
mum flexibility. Adders and 
counters may be of any length, 
start anywhere in a column of 
CLBs, and have their MSB at 
either the top or the bottom. The 
carry can run up or down a column
of CLBs, and can also run sideways
at the top and bottom, accommodating
very long adders and
counters; when the adder or
counter reaches the top or bottom
of a column, it simply turns around
and continues in the next one,
with no loss of performance.

Only the carry path of the 
adder is implemented in dedicated 
logic, as shown in the figure. Be- 
tween CLBs, the carry is routed on 
special interconnect lines that are 
only available to the carry logic. 
This combination of dedicated 
logic and high-speed interconnect 
provides the high carry-propaga- 
tion speed. 

The carry network supple- 
ments the other resources in the 
CLB. The outputs from the carry 
chain are available as inputs to the 
CLB function generators. The 
adder sums are formed in these 
function generators, just as any 
other logic function. 

In addition to the basic adder 
configuration, the carry logic and 
function generators may be con- 
figured to provide a subtracter or 
an adder/subtracter. All three 
functions can be modified to work 
with a single operand, providing 
incrementers and decrementers. 
These incrementers and
decrementers, together with the
CLB flip-flops, are used to generate
the high-speed counters.
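
The split between carry logic and function generators can be illustrated with a small bit-level model. This is only a sketch in Python, not Xilinx code: the name `ripple_add` and its parameters are hypothetical, and the carry equation is the textbook ripple-carry form, not a description of the actual CLB circuit.

```python
def ripple_add(a, b, width, carry_in=0, subtract=False):
    """Bit-serial model of a ripple-carry adder/subtracter.

    Returns (result, carry_out). For subtraction, b is inverted and the
    carry-in is forced to 1 (two's complement), so carry_out == 1 means
    "no borrow".
    """
    mask = (1 << width) - 1
    if subtract:
        b = ~b & mask
        carry_in = 1
    carry = carry_in
    result = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        # Sum bit: the part formed in a function generator.
        result |= (ai ^ bi ^ carry) << i
        # Carry bit: the part handled by the dedicated carry path.
        carry = (ai & bi) | (carry & (ai ^ bi))
    return result, carry


# Single-operand use: an incrementer is an adder with b = 0, carry_in = 1.
print(ripple_add(7, 0, 8, carry_in=1))
```

Setting `b` to zero and using only the carry-in gives the incrementer case mentioned in the text; the decrementer follows the same pattern with `subtract=True` and `b = 1`.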

The performance achieved by 
the dedicated carry logic is out- 
standing; 16-bit adders and sub- 
tracters settle in 20.5 ns, and yet 
consume only eight CLBs. 32-bit 
adders and subtracters use 16 
CLBs, and settle in 32.5 ns. 

Loadable up counters and 
down counters use the same 
number of CLBs, and support clock
frequencies of 40 MHz for 16 bits 
and 27 MHz for 32 bits. Non- 
loadable up/down counters also 
achieve these speeds and CLB 
counts, while loadable up/down 
counters are slightly slower, and
require additional CLBs.

Currently, the user can access 





counters that use the carry logic. 
These Hard Macros may be incor- 
porated directly into user designs 
at the schematic level. Alterna- 
tively, the user may generate his 
own Hard Macros using the 
HMGEN utility. 

The third way to access the 
dedicated carry logic is through
X-BLOX. All counters and arithmetic
functions in X-BLOX are implemented
with the carry logic. A
Hard Macro of the appropriate size
and functionality is automatically 
generated and used in the design. 

In the future, it will be pos- 
sible to access the dedicated carry 
logic at the schematic level. 

For more information on the 
dedicated carry logic, please refer 
to the Xilinx Application Note
Using the Dedicated Carry Logic in

Conceptual Diagram of a Typical Addition (2 Bits/CLB)

Estimating Adder and
Counter Performance 


In most LCA designs, perfor- 
mance cannot be estimated with 
any accuracy until after implementation.
This is because the performance
is affected by routing
delays, and, prior to implementa- 
tion, these are not known. How- 
ever, in adders and counters using 
the XC4000 dedicated carry logic, 
delay estimation is possible. 

The carry path in an adder 
uses dedicated interconnects be- 
tween CLBs. These interconnects 
introduce a fixed delay, even when
the carry passes from one CLB column
to the next at the top or bottom
of the array. This permits the
routing delay to be incorporated 
into the CLB specifications published
in the data sheet. As a result,
the propagation delay through an 
adder can be calculated using only 
data-sheet specifications. 

For a typical adder, this calculation
can be reduced to a simple
formula. In an XC4000-5, the maximum
propagation delay from the
operand input to the sum output
of an N-bit adder is approximately


t_add = 8.5 + 0.75N ns


This estimate does not include 
the delay from the operand source 
register to the adder or any addi- 
tional delay reaching the destina- 
tion register. However, it is still a 
useful benchmark. 

For an N-bit counter, the minimum
clock period that permits the
carry path time to settle is approximately


t_clk = 13 + 0.75N ns

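
As a quick check, the two estimates can be evaluated for common widths. The sketch below simply codes the formulas quoted above for an XC4000-5; the function names are illustrative only and do not come from the application note.

```python
def adder_delay_ns(n_bits):
    """Estimated operand-input to sum-output delay of an N-bit adder (XC4000-5)."""
    return 8.5 + 0.75 * n_bits


def counter_min_period_ns(n_bits):
    """Estimated minimum clock period of an N-bit counter (XC4000-5)."""
    return 13.0 + 0.75 * n_bits


def counter_max_mhz(n_bits):
    """Clock frequency corresponding to the minimum period."""
    return 1000.0 / counter_min_period_ns(n_bits)


for n in (16, 32):
    print(n, adder_delay_ns(n), round(counter_max_mhz(n), 1))
```

For 16 and 32 bits this gives adder delays of 20.5 ns and 32.5 ns, and counter clock rates of 40 MHz and about 27 MHz.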
For more information on the 
derivation and limitations of these
formulae, please refer to the Xilinx
Application Note Estimating the
Performance of XC4000 Adders and
Counters (XAPP 018).
BN 







Lower Cost 
for High-Volume Production 


LCA devices are recognized 
as a cost-effective means of implementing
logic during development
and for limited volume produc- 
tion. However, for higher volume 
production, it may be necessary to 
reduce component costs. Masked 
gate arrays are an option, but con- 
verting a design to a gate array is 
costly in both time and money. 

For mid-volume production, 
HardWire™ devices offer reduced 
component cost, avoiding the high 
cost of conversion because the 
mask-programmed HardWire de- 
vices are architecturally identical 
to their RAM-programmed coun- 
terparts. All the effort put into 
implementing the original design 
is re-used. 

The HardWire mask is derived
from the routed LCA file. Conse- 
quently, the HardWire device is 
guaranteed to be logically correct. 
The HardWire device is also guar- 
anteed to meet, or beat, the worst- 
case delays of the RAM-based 
design. Logic partitioning, CLB 
locations and routing are all un- 
changed by the conversion, and 
mask-programmed interconnec- 
tions are always faster. 

Compare this automated, low-risk
approach to the gate-array
conversion process: first, the netlist
must be converted to gate-array 
format, followed by placement, 
routing, and simulation to verify 
the timing. These steps may need 
to be iterated several times to sat- 
isfy a critical timing requirement. 

While gate arrays require a 
new set of test vectors to be devel- 
oped for each design, Xilinx auto- 
matically generates test vectors for 
HardWire devices. Using dedi- 
cated scan-test logic built into the 
device, these test vectors provide 
100% fault coverage. 





HardWire device benefits go 
beyond the ease of conversion. An 
unexpected increase in the demand
for a product that uses HardWire 
devices can easily be accommo- 
dated. Simply revert temporarily 
to the readily available program- 
mable version that fits the same 
socket. Production is not delayed 
by complete dependence on a
long-lead-time custom product.

RAM-based devices also sim- 
plify the addition of new features 
to extend the life of a product. 
Develop them in a RAM-programmable
device in the production
board and move the new design 
immediately into production, us- 
ing RAM-programmed devices 
until a new HardWire device is 
available. 

As production typically slows 
at the end of the product life cycle, 
it may again be more cost-effective
to use RAM-programmed devices.
Production is not constrained by 
minimum purchase requirements, 
and the standard devices will con- 
tinue to be available as replace- 
ment parts. 

Virtually all LCA designs are 
suitable for HardWire conversion. 
The only restriction is that correct 
operation must not depend upon
some minimum delay; the 
HardWire device can be much 
faster. Such asynchronous design 
is considered bad practice, and a 
special design-rule-checking 
(DRC) program is used to evalu- 
ate all designs prior to conversion 
and to flag potential problems. 

The first step in converting an 
LCA design to a HardWire design 
is to submit the design for evalua- 
tion. Once it is approved, proto- 
type HardWire devices are 
available in six weeks. Production 
quantities follow eight weeks later. 


HardWire versions of all three LCA families are available.
Boundary Scan Simplifies Board Test


Testing of printed-circuit 
boards can be a major problem. 
While integrated circuits can be 
tested before insertion, there is no 
way to obtain such a high level of 
confidence in the interconnection 
network on which they depend. 
Simple open circuits and short cir- 
cuits in the interconnect are fatal 
flaws, but are often difficult to de- 
tect. 

Connectivity testing is a sig- 
nificant problem. As ICs become 
more complex, it is increasingly
difficult to control and observe the
printed circuit traces between 
them. Probing the card with a “bed-of-nails”
tester offers a possible
solution, but, with denser packag- 
ing and devices surface-mounted 
to both sides of the board, even 
this becomes problematic. 





Simplifying printed-circuit- 
board connectivity testing is the 
primary purpose of boundary- 
scan diagnostics. Circuitry associ- 
ated with each pin permits a 
known signal to be driven onto 
every trace, and the signal to be 
checked at every destination. This 
test can be repeated with different 
signals to detect open and short 
circuits, and this can be done for 
every driver. 
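
The idea of driving known signals and checking them at every destination can be sketched as a small simulation. This is a hypothetical Python model, not part of IEEE 1149.1 or any Xilinx tool: the function names are invented, and the fault behaviour (an open receiver reads high, shorted nets behave as a wired-AND) is an assumption chosen for illustration.

```python
def walking_zero_vectors(n_nets):
    """One test vector per net: that net is driven low, all others high."""
    return [[0 if i == j else 1 for i in range(n_nets)] for j in range(n_nets)]


def receive(driven, opens=(), shorts=()):
    """Model what the receivers see for one driven vector.

    Assumptions: shorted nets act as a wired-AND; an open receiver floats high.
    """
    seen = list(driven)
    for group in shorts:
        level = min(driven[i] for i in group)
        for i in group:
            seen[i] = level
    for i in opens:
        seen[i] = 1
    return seen


def find_faulty_nets(n_nets, opens=(), shorts=()):
    """Apply every vector and report nets whose received value ever differs."""
    faulty = set()
    for vector in walking_zero_vectors(n_nets):
        seen = receive(vector, opens, shorts)
        faulty.update(i for i in range(n_nets) if seen[i] != vector[i])
    return sorted(faulty)


print(find_faulty_nets(4, opens=[3], shorts=[(0, 1)]))
```

Repeating the test with a different signal on each net is what exposes both kinds of fault: the open shows up when its own net is driven low, and the short shows up when either of the shorted nets is driven low.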

Test vectors are distributed to 
the drivers through a serial trans- 
mission path. The same path, 
which includes all inputs and outputs,
is used to recover test results.
The serial transmission scheme is 
standardized so that different 
vendors’ products can work to- 
gether. Its operation is defined by 
IEEE specification 1149.1, sometimes
referred to as the JTAG
specification after the committee 
that originated it. 

IEEE 1149.1/JTAG boundary- 
scan testing is supported in 
XC4000-series LCA devices. Dedicated
logic and pins provide a Test
Access Port (TAP), as defined by
the standard, and dedicated logic
in the IOBs provides the boundary-scan
Data Register and associated
test functions. 

Built-in self-test of the LCA 
device, which is optional under 
the IEEE specification, is not ex- 
plicitly supported. However, the 
XC4000 LCA architecture permits 
internal logic to be connected to 
the TAP. Test logic can now be 
configured into the LCA device, 
and its operation controlled by the 
boundary-scan test system. 





Boundary-Scan Diagnostics











Implementing Boundary 
Scan in XC3000 


Although XC3000-series LCA 
devices do not contain dedicated 
boundary-scan logic, it is possible 
to configure an XC3000 to emulate 
boundary scan. This emulation 
consumes a significant amount of 
the LCA resources (almost all in an
XC3020), and it is not suggested
that boundary scan be built into a 
working design. However, because 
the RAM-based LCA device is 
reconfigurable, it can be config- 
ured for board testing, and then 
reconfigured for operation. 

Four pins must be dedicated 
to the Test Access Port (TAP). Due 
to external interconnection re- 
quirements, these pins can prob- 
ably not be re-used in the actual 
design. The TAP Controller, In- 
struction Register, Bypass Regis- 
ter and Test Data Output Buffer 
together with miscellaneous logic 
require 11 CLBs. 

The CLB requirement for the 
Test Data Register depends upon 
the number of IOBs used, and how
they are configured. Each requires 
between 0.5 and 1.5 CLBs, plus 
one CLB for each distinct 3-state 
control. While this may not allow 
every IOB to be bidirectional with 
an independent 3-state control, it 
will accommodate most designs. 
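
The CLB budget can be bounded with simple arithmetic. The sketch below uses only the figures quoted in the text (11 CLBs of core logic, 0.5 to 1.5 CLBs per IOB, one CLB per distinct 3-state control); the function name and argument names are illustrative.

```python
CORE_CLBS = 11  # TAP Controller, Instruction Register, Bypass Register, etc.


def emulator_clb_range(n_iobs, n_tristate_controls):
    """Return (minimum, maximum) CLBs needed for the boundary-scan emulation."""
    low = CORE_CLBS + 0.5 * n_iobs + n_tristate_controls
    high = CORE_CLBS + 1.5 * n_iobs + n_tristate_controls
    return low, high


# Example: 20 IOBs in use, sharing 2 distinct 3-state controls.
print(emulator_clb_range(20, 2))
```

For 20 IOBs and two 3-state controls the estimate lies between 23 and 43 CLBs, depending on how the IOBs are configured.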

A specific boundary-scan 
emulation must be created for each
LCA design. This comprises the 
11 CLBs of core logic, which is 
common to all emulations, and a 
Test Data Register concatenated 
from four macros according to the 
output usage in the design. 

For more information on us- 
ing boundary scan in the XC3000, 
see the Xilinx Application Note 
Boundary Scan Emulator for XC3000 
(XAPP 007). 

BN 
LCA Performance:


Ask the Right Question 


Before starting an LCA™ de- 
sign, it is a good idea to do some 
quick performance calculations, 
just to make sure you are in the 
right ballpark. It is tempting to try 
estimating the highest speed that 
the design can achieve. However, 
it is usually much easier, and just 
as useful, to determine whether a 
predetermined speed can be at- 
tained. 

Given the desired clock fre- 
quency, it is easy to estimate the 
logic complexity that can be sup- 
ported. This complexity can then 
be compared to the functional re- 
quirements to determine feasibil- 
ity. Only in marginal cases is a 
complete speed evaluation neces- 
sary. 

Typically, a data path runs 
from a register, through some 
combinatorial logic to another 
register. In an LCA device, the 
shortest data path involves a CLB 
clock-to-output delay plus a CLB
set-up time: a total of 9.5 ns in an 
XC3000-150. However, this time 
does not include any allowance 
for routing. Adding 4 ns for rout- 
ing, the shortest data path is typi- 
cally 13.5 ns. 

If additional combinatorial 
CLBs are added into the data path,
each level of CLB adds 4.5 ns, and 
additional routing delay is also 
introduced. Including a typical 
routing allowance, 8.5 ns should 
be added for each level of combi- 
natorial CLB. 

This simple speed-estimating 
procedure can also be reversed. 
If the system clock frequency is 
30 MHz, the 33-ns period typically
provides for two combinatorial
CLBs between registered CLBs. 





Clock period            33   ns
Minimum delay           13.5 ns
Remaining               19.5 ns


Each combin. delay       8.5 ns
# of combin.
CLBs possible            2


Including the function gen- 
erator in the destination CLB, a 
total of three function generators 
can be cascaded. Knowing the 
number of function generators that 
can be cascaded, the design can be 
analyzed to determine whether or 
not it is feasible. 
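
The reversed procedure is easy to automate. The following Python sketch uses the XC3000-150 figures from the text as defaults (13.5 ns for the shortest routed path, 8.5 ns per additional combinatorial CLB level); the function name and parameter names are illustrative.

```python
import math


def combinatorial_levels(clock_mhz, base_ns=13.5, per_level_ns=8.5):
    """Number of combinatorial CLB levels that fit between two registers.

    Returns None if even the shortest register-to-register path is too slow.
    """
    period_ns = 1000.0 / clock_mhz
    remaining_ns = period_ns - base_ns
    if remaining_ns < 0:
        return None
    return math.floor(remaining_ns / per_level_ns)


levels = combinatorial_levels(30)
print(levels, "combinatorial CLBs,", levels + 1, "cascaded function generators")
```

At 30 MHz this yields two combinatorial CLBs, or three function generators once the one in the destination CLB is counted. Like the hand calculation, it is a feasibility check only.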

Of course, this is only a very 
rough calculation intended to establish
feasibility; it neither establishes
a performance limit nor
guarantees that a level of performance
can be achieved. It does,
however, give some indication of 
the level of difficulty involved in 
the design. 

In addition, critical areas can 
be identified prior to starting the 
design. It is better to accommo- 
date critical areas from the outset, 
rather than “fix” them later. Con- 
versely, if a design only requires a 
fraction of the capability available,
it might be possible to multiplex
some functions, and provide a less
costly solution. 

For information on how to use 
this procedure in other LCA de- 
vices, see the Xilinx Application 
Note LCA Speed Estimation: Asking 
the Right Question (XAPP 011). 
Faster Multiplexers in XC3000


The traditional building block 
for large multiplexers in XC3000 is 
a dual 2-input MUX. This building
block comprises two functions of 
three variables, and uses all five 
inputs to the CLB. A 4-input MUX 
cannot be constructed in a single 
CLB since it requires six inputs. 

Using the dual 2-input MUX, 
larger multiplexers can be con- 
structed using a conventional tree 
approach, with each select bit as- 
sociated with one CLB level. This 
results in 8:1 multiplexers that use 
four CLBs in three levels, and 16:1 
multiplexers that use eight CLBs 
in four levels. 
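
The CLB and level counts for the conventional tree follow directly from the dual 2-input MUX building block. The Python sketch below assumes that both MUXs in a CLB share one select line, so they must sit on the same tree level; the function name is illustrative and the model covers power-of-two input counts only.

```python
def tree_mux_cost(n_inputs):
    """Return (CLBs, levels) for an N:1 multiplexer built as a binary tree
    of dual 2-input MUX CLBs. N must be a power of two, at least 2."""
    if n_inputs < 2 or n_inputs & (n_inputs - 1):
        raise ValueError("n_inputs must be a power of two >= 2")
    clbs = 0
    levels = 0
    signals = n_inputs
    while signals > 1:
        muxes = signals // 2        # 2:1 MUXs needed on this level
        clbs += (muxes + 1) // 2    # two MUXs per CLB, rounded up
        levels += 1
        signals = muxes
    return clbs, levels


print(tree_mux_cost(8), tree_mux_cost(16))
```

This reproduces the figures quoted above: four CLBs in three levels for 8:1, and eight CLBs in four levels for 16:1.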

However, a 3-input MUX can 
be implemented in only one CLB. 
Such 3-input MUXs can implement
larger multiplexers that have less 
delay, while retaining the binary 
encoding of the select lines. 

The 8:1 multiplexer, shown 
below, also provides an enable 
input. Again, four CLBs are used, 
but with only two levels of delay. 
The enable input permits the mul- 
tiplexer to be expanded using only 
one additional level of CLBs. De- 
coded select lines are used to enable 
up to five 8:1 multiplexers into an 
OR gate. In this way, 3-level multi- 
plexers with up to 40 inputs may 
be constructed. 

For 16:1 multiplexers, the sec- 
ond design uses eight CLBs, and 
again has three levels of delay. It 
also has binary-coded inputs, and 
uses fewer CLBs than two 8:1 mul- 
tiplexers with the necessary ex- 
pansion logic. 

BN 
When selecting a counter design
for a specific application, there
are three primary considerations:
does it meet the functional requirements,
is it fast enough, and could
it use fewer LCA resources? 

The functional requirements 
that must be considered include 
binary/non-binary operation, up,
down and up/down counting, 
loadability, the provision of set/ 
clear, count enable, and synchro- 
nous operation to permit output 
decoding. Speed and resource uti- 
lization are self-explanatory, and 
can often be traded against each 
other. 

However, it must be realized 
that as a counter becomes more 
complex, it usually becomes both 
larger and slower. The table sum- 
marizes the characteristics of vari- 
ous counter designs available for 
the XC3000. 

For a more detailed descrip- 
tion of the designs mentioned 
below, see the individual 
Application Notes. 


High-Speed Synchronous 
Prescaler Counter (XAPP 001) 
This simple design provides a very
basic non-loadable up counter
with a count-enable control. However,
this simplicity permits it to
be both the densest and the second 
fastest design. It is easy to convert 
the design into a down counter, 
but not possible to convert it into 
an up/down counter. 





XC3000 Counters 


Simple, Loadable, Up/Down 
Counter (XAPP 002) 

Being loadable, this counter is un- 
able to benefit from the prescaler 
technique, and a simple ripple- 
carry scheme is used throughout. 
Consequently, it is slower than the 
above design. The maximum clock
frequency is inversely proportional 
to the length of the counter; the 
ripple-carry path incurs one T,,, 
delay for each two bits. 

A modification to this counter 
almost doubles the maximum 
clock rate by dividing the carry 
path into two halves. With this 
modification, the carry path settles
in approximately half the time. 
However, this modification re- 
quires one additional CLB. 


Synchronous Presettable 
Counter (XAPP 003) 


In this design, speed is increased 
by replacing the serial gating of 
the ripple-carry path with parallel 
gating. Ideally, with arbitrarily 
wide gates, the carry-path settling 
time could be reduced to one gate 
delay. 

However, with limited gate 
width, the settling time increases 
logarithmically with counter 
length; this is still a significant 
improvement over the linear in- 
crease seen previously, especially 
in longer counters. The additional 
speed is achieved at the cost of 
using more CLBs with more com- 
plex routing. 





Loadable Binary Counter 
(XAPP 004) 


The loadable binary counter also 
uses parallel gating to accelerate 
the carry path. In this case, how- 
ever, a more structured approach 
is taken. A fast lookahead-carry 
technique is used, resulting in a 
carry path with a consistent depth
of gating. Consequently, there are 
many equally critical paths. 

The regular structure lends it- 
self to hand placement for maxi- 
mum speed. The irregularity and 
smaller number of critical paths of 
XAPP 003 reduce its dependence
on CLB placement, benefiting the 
automatic placement tools. 
XAPP 003 performance may be 
improved by re-routing a few 
critical paths, but it will not match 
an optimally placed XAPP 004. 


Ultra-Fast Synchronous 
Counters (XAPP 014) 
In some applications, such as clock
division, the only requirement is a
high clock rate. This counter is 
designed to fill that need. It is 
approximately twice as fast as 
XAPP 001 described above, but 
uses almost twice as many CLBs. 

The key is the use of a prescaler
technique, together with an active 
Longline to distribute the parallel 
count enable. This distribution 
scheme uses replicated flip-flops 
to eliminate delay but depends 
upon the predictability of the 
binary sequence. 





Counter Performance in XC3000-150 





Loadable | Up | Down) Up/ 8-Bit 10-Bit 12-Bit 16-Bit | 20-Bit | 24-Bit 32-Bit 
Down | MHz | CLBs| MHz |CLBs | MHz/CLBs| MHz |CLBs|MHz|CLBs|MHz|CLBs| MHz |CLBs 
XAPP 001 : go | 5 | 68 | s | 65/9 | 65 | 14 | 63| 17 | 63!/ 21 
XAPP 002 . . : 31 | 8 | 25 | 10 | 25| 12 | 20] 16/15] 20] 13 | 24 
XAPP 002 : : : | | 28 | 17 
XAPP 003 : : : 39 | 8 32 | 15 | 31 | 20 
XAPP 004 : : | | 34 | 23 23 | 49 
XAPP 004 | 30 | 27 23 | 56 
XAPP 014 . 16 [24 | 
150-MHz Presettable Counter in XC3000


Prescaling is an established 
technique for high-speed counters. 
Using a derivative of this technique,
LCA devices can implement
a presettable counter at the full
150-MHz toggle rate of an XC3000-150.
These counters can be up to
24 bits long.

In a prescaler counter, a small,
very fast counter divides the clock 
rate. The divided clock is provided
to a large, slower counter that is 
unable to settle at the fast clock 
rate. However, even when imple- 
mented synchronously, a conven- 
tional prescaler counter cannot be 
loaded; the technique depends 
upon the predictable binary se- 
quence to ensure that the larger 
counter has adequate settling time. 

If the prescaler counter is 
loaded with an arbitrary value, 
the binary sequence is broken, and 
the settling time of the larger 
counter is no longer guaranteed. 
To ensure an adequate settling 
time, either the clock frequency 
must be reduced significantly, or 
the values that can be loaded must 
be severely restricted. 

To provide presettable pre- 
scaler counters, John Nichols of 
Fairchild Applications introduced 
a pulse-swallowing technique in 
1970. It uses a dual-modulo 
prescaler that can divide the clock 
by 2^n or 2^n+1. See page 6-38 of the
Xilinx 1992 Data Book for further 
information. 

Twenty years later, Xilinx developed
a variation of the
pulse-swallowing technique for
use in LCA devices. This technique,
called state-skipping, uses a dual-modulo
prescaler that can divide
by 2^n or 2^n-1.
In a state-skipping counter, the
prescaler is not loaded. Instead, 
the least significant bits of the load 
value are used to initiate a correc- 
tion counter that controls the 
modulus of the prescaler. Consequently,
the larger counter, which
contains the more significant bits,
always has at least 2^n-1 clock periods
in which to settle, even after a
load.

Typically, the minimum of 
2^n-1 clock periods between the load
and the first clock to the larger 
counter is longer than is required. 
To compensate, the prescaler op- 
erates with its shorter cycle until 
any extra delay has been nullified. 
This compensation is controlled 
automatically by the correction 
counter. 

For example, in a counter using
a ÷7/÷8 prescaler, the value
loaded might require the first clock
to the larger counter to occur 5 clock
periods after the load. In this case, 
the minimum 7-clock cycle period 
of the prescaler delays the first 
clock to the larger counter by two 
periods. 

To nullify this extra delay, the 
prescaler continues dividing by 7 
for a further two cycles, cancelling 
one clock period of the extra delay 
each cycle. The third clock to the 
larger counter occurs 21 periods 
after the load, which is the sameas 
in a conventional counter (5 + 8 + 
8=21 clocks). Once the compensa- 
tion is complete, the prescaler re- 
turns to dividing by 8. 
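
The timing in this example can be reproduced with a few lines of Python. This is a behavioural sketch only: the function names are hypothetical, `first` stands for the clock period at which a conventional counter would first clock the larger counter after a load, and the rule that the prescaler runs `modulus - first` short cycles is inferred from the worked example above, not taken from a Xilinx design file.

```python
def conventional_clocks(first, modulus, count):
    """Clock periods (after the load) at which a conventional prescaler
    would clock the larger counter."""
    return [first + k * modulus for k in range(count)]


def state_skipping_clocks(first, modulus, count):
    """Same, for a state-skipping prescaler that divides by modulus - 1
    until the extra delay has been nullified, then by modulus."""
    if not 1 <= first <= modulus:
        raise ValueError("first must be in 1..modulus")
    short_cycles = modulus - first  # cycles spent dividing by modulus - 1
    t = 0
    clocks = []
    for k in range(count):
        t += modulus - 1 if k < short_cycles else modulus
        clocks.append(t)
    return clocks


print(conventional_clocks(5, 8, 5))
print(state_skipping_clocks(5, 8, 5))
```

With `first = 5` and a ÷7/÷8 prescaler, the state-skipping sequence starts two periods late at 7, reaches 14, and coincides with the conventional sequence from 21 onward.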





Clearly, the counter will oper- 
ate in a non-binary manner while 
the correction is being made. Dur- 
ing this time, the counter skips a 
state each cycle of the prescaler, 
hence the name of the technique. 
The maximum time to complete 
the correction is 2^n-1 cycles of the
prescaler. A further consequence
of state-skipping is that some small
division ratios cannot be used,
because the correction cannot be
completed within the period of
the counter. In addition, the load
must be synchronized with the 
prescaler cycle. This happens au- 
tomatically if the counter is loaded 
when it reaches TC. This is common
practice for timers and dividers,
which are excellent applications
for state-skipping counters.

With these exceptions, a state- 
skipping counter may be loaded 
exactly like a conventional binary 
counter. There is no need to modify 
the load value required for any 
given divide ratio, as is necessary 
with a pulse-swallowing counter. 

One advantage of the state-skipping
technique that is peculiar
to LCA implementation is that
a ÷3/÷4 prescaler can be built in a
single CLB. This is the key to the 
150-MHz presettable counter, 
shown in the Figure. 

The counter uses two state- 
skipping prescalers in cascade. 
Each is a 2-bit dual-modulo 
prescaler that divides by 3 or 4, 
and each has its own correction 
counter. Only the first prescaler is 
clocked by the high-speed clock. 
The maximum clock rate to the 
remainder of the counter is at least 
three times slower. 







The first prescaler is imple- 
mented in a single CLB, and the 
counter design allows the control 
inputs several clock cycles to set 
up. Consequently, the high-speed 
clock is limited only by the toggle 
rate of the flip-flops in this CLB. In 
an XC3000-150 this is 150 MHz. 

The remaining counters, in- 
cluding the first correction counter, 
are all clocked by Q,. This syn- 
chronous operation permits the 
correction counters and Q,-Q,, to 
be loaded by Terminal Count in a
conventional way. 

In each cycle of the second 
prescaler, only one of the three or 
four first-prescaler cycles can be a 
correction cycle. Consequently,
the divide ratios of the composite
prescaler are limited to 11, 12, 15 and
16, depending on which prescalers
are correcting. This permits the Q,
-Q,, counter at least 11 clock cycles
in which to settle, and distribute 
the parallel enable signal. 
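
The four composite divide ratios can be enumerated directly. In this Python sketch the function name is illustrative; it encodes only what the text states: each prescaler divides by 3 or 4, and at most one first-prescaler cycle per second-prescaler cycle is a correction (÷3) cycle.

```python
def composite_divide_ratios():
    """Divide ratios of two cascaded ÷3/÷4 state-skipping prescalers."""
    ratios = set()
    for second in (3, 4):              # first-prescaler cycles per output cycle
        for corrections in (0, 1):     # at most one of them is a ÷3 cycle
            ratios.add(second * 4 - corrections)
    return sorted(ratios)


print(composite_divide_ratios())
```

The smallest ratio, 11, is the minimum number of clock cycles available for the larger counter to settle.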


Each time a prescaler correc- 
tion cycle occurs, the correspond- 
ing correction counter is 
decremented. Correction cycles 
continue while the correction 
counters are non-zero. When zero 
is reached in either of the correc- 
tion counters, the corresponding 
prescaler ceases correcting, and 
that correction counter remains at 
zero until it is reloaded.

Correction can take up to 45 
clock periods to complete, and 
during this time some counter values
will be skipped. However, the
counter behaves in a conventional
binary manner after less than 46
clock cycles. Some divide ratios
below 30 cannot be used, since the
correction time is greater than the
counter period, but all divide ratios
of 30 or greater are available.

State-skipping counters are the
subject of an upcoming series of
Application Notes. Design files
for the 24-bit 150-MHz Presettable
Counter are available as XAPP 021.
Accelerating XC4000 Counters


The dedicated carry logic in 
XC4000 LCA devices provides a 
mechanism for very fast and effi- 
cient counters. While the ripple- 
carry scheme appears simplistic, 
the hardware implementation of 
the dedicated carry logic is very 
fast, and requires few CLBs. In 
fact, the implementation is so effi- 
cient that it defeats most attempts 
to replace it. It is possible, how- 
ever, to augment the operation of 
the carry logic and obtain higher 
performance. 

To accelerate the counter, the 
effective length of the carry path 
must be shortened. This is achieved
by dividing the counter into two 
sections that settle in parallel, as 
shown in the Figure. The carry 
output of the less significant sec- 
tion provides a parallel Count En- 
able (CEP) to the more significant 
section. The use of CEP is most 
often associated with prescaler 
operation, but this is not necessar- 
ily the case. 

In a prescaler counter, CEP is
typically decoded from the least 
significant two or three bits. The 
CEP signal is then used to enable 
the remaining bits, such that their 
effective clock rate is one fourth or 
one eighth of the actual clock rate. 
This allows multiple clock periods 
for the remaining bits to settle; the 





whole counter can be operated at 
the speed of the prescaler, in spite 
of the long carry path. 

Using the prescaler technique, 
however, it is not possible to load 
the counter and guarantee that it 
will count correctly on the follow- 
ing clock cycle. The carry chain in 
the more significant bits is de- 
signed to settle in multiple clock 
periods. If these bits are enabled to 
count on the clock following the 
load operation, the carry path will 
not, in general, have had adequate 
settling time. Depending on the 
value loaded, it might not be pos- 
sible to resume counting for sev- 
eral clock periods after the load 
operation. 

This problem may be avoided 
if the carry chain is divided into
approximately equal halves, both
of which can settle within the clock
period. The parallelism in the carry
chain reduces the clock period, but 
not as dramatically as with a 
prescaler. However, loadability is 
retained. 

The carry delay is reduced to 
the settling time of the more sig- 
nificant section of the counter, or 
the settling time of the less signifi- 
cant section plus the subsequent 
routing and count enable times, 
whichever is greater. For optimum 
performance, the counter must be 





divided into unequal sections such
that these times are balanced. 
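
The balancing rule can be explored numerically. The Python sketch below rests on two loudly stated assumptions: each section's settling time is modelled with the 13 + 0.75N ns counter estimate quoted elsewhere in this issue, and `cep_overhead_ns` (the routing plus count-enable time) is a placeholder value, not a data-sheet figure. It is not intended to reproduce the 34-MHz result given below.

```python
def settle_ns(bits, base_ns=13.0, per_bit_ns=0.75):
    """Assumed settling time of one counter section (XC4000-5 estimate)."""
    return base_ns + per_bit_ns * bits


def best_split(total_bits, cep_overhead_ns):
    """Find the less-significant section length that minimises the clock period.

    The period is the greater of the more-significant section's settling time
    and the less-significant section's settling time plus the CEP overhead.
    Returns (ls_bits, period_ns).
    """
    best = None
    for ls_bits in range(1, total_bits):
        ms_bits = total_bits - ls_bits
        period = max(settle_ns(ms_bits),
                     settle_ns(ls_bits) + cep_overhead_ns)
        if best is None or period < best[1]:
            best = (ls_bits, period)
    return best


# Hypothetical 6-ns overhead for routing and count enable.
print(best_split(32, 6.0))
```

With the placeholder overhead, the optimum places 12 bits in the less significant section and 20 in the more significant one, which shows why the best split is unequal.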

This technique is most effective
in long counters. For example,
a 32-bit counter in an XC4000-5 
can be accelerated from 27 to 34 
MHz. Variations of the technique, 
however, permit some advantage 
to be gained in counters as short as
six bits. 

In non-loadable counters 
where the prescaler technique is 
used, the critical delay is usually 
the distribution of CEP. Using a 1- 
bit prescaler, this delay can be re- 
duced by replicating the prescaler 
(the LSB of the counter), such that 
everywhere CEP is used, it is available
from the adjacent CLB. This
in effect creates an “active 
Longline.” Adding a second 
prescaler stage permits this tech- 
nique to be used in significantly 
long counters. 

The active Longline technique
is expensive in CLBs, but boosts
the clock frequency to the full shift-register
frequency of the LCA device.
In an XC4000-5, this is 110
MHz. 

For more information on these
counters, see the Xilinx Applica- 
tion Notes Accelerating Loadable 
Counters in XC4000 (XAPP 023), to 
be published shortly, and Ultra Fast 
Synchronous Counters (XAPP 014). 





Accelerated N-Bit Counter


Thin Quad 
Flat Pack 
(TQFP-100) 


Xilinx offers a new, smaller 
outline 100-pin package option for 
the XC2000 and XC3000 families. 
The package body dimensions are 
14 mm x 14 mm x 1.4 mm (0.55"
x 0.55" x 0.055") with 25 gull-wing
leads on each side. Lead pitch is
0.5 mm (0.02") and the total PC-board
footprint area is only 16 mm
x 16 mm (0.63" x 0.63").

The XC3042 in a TQFP-100 
offers 3000 user-programmable 
gates in the space of a dime. (The 
diameter of a dime is 18 mm, its 
thickness is 1.4 mm). 

This package is ideally suited 
for PCCard and other high-density 
applications. It is also an ideal 
match for the low power 
consumption and in-situ
programmability of the Xilinx 
FPGAs. Anti-fuse or EPROM- 
based programmable devices 
cannot use such a fine-lead 
package, because the insertion into 
a programmer would inevitably 
bend the leads out of alignment. 

XC2018, XC3030, and XC3042 
devices in TQFP are now shipping 
in production volume. 





H-Spice Models 
Are Available 


H-Spice models of Xilinx LCA 
output circuits are available from: 


Meta-Software 

1300 White Oaks Road 
Campbell, CA 95008 
tel: (408) 371-5100 

fax: (408) 371-5638 







Linear Feedback 
Shift Register 
Counters 


Conventional binary counters
use complex or wide fan-in logic
to generate high-end carry signals.
A much simpler structure sacrifices
the binary count sequence,
but achieves very high speed with
very simple logic, easily packing
two bits into every CLB. Linear
Feedback Shift Register (LFSR)
counters are also known as
pseudo-random sequence generators.

An n-bit LFSR counter can
have a maximum sequence length
of 2^n - 1. It goes through all
possible code permutations except
one, which is a lock-up state. A
maximum-length n-bit LFSR counter
consists of an n-bit shift register
with an XNOR in the feedback
path from the last output Qn to the
first input D1. (The XNOR makes
the lock-up state the all-ones state;
an XOR would make it the
all-zeros state. For normal Xilinx
applications, all-ones is preferred,
since the flip-flops wake up in the
all-zeros state.)
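
The behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Xilinx code: the function name lfsr_step and the tuple representation of the register are assumptions made for this example. It walks the 3-bit counter (feedback from outputs 3 and 2, as listed in the table) from the all-zeros wake-up state and shows that all-ones is the one state never reached.

```python
def lfsr_step(state, taps):
    """Advance an XNOR-feedback shift register by one clock.

    state: tuple (Q1, ..., Qn) of 0/1 values.
    taps:  1-based numbers of the outputs that feed the XNOR.
    """
    ones = sum(state[t - 1] for t in taps)
    d1 = 1 if ones % 2 == 0 else 0       # XNOR output is 1 on even parity
    return (d1,) + state[:-1]            # D1 enters Q1, the rest shift by one

n, taps = 3, (3, 2)                      # the n = 3 row of the table
start = (0,) * n                         # flip-flops wake up all-zeros
lockup = (1,) * n                        # the state the XNOR feedback locks up in

# Collect every state visited before the register returns to the start.
cycle = [start]
while True:
    nxt = lfsr_step(cycle[-1], taps)
    if nxt == start:
        break
    cycle.append(nxt)

print(len(cycle), "states:", cycle)
print("all-ones visited:", lockup in cycle)
print("all-ones maps to itself:", lfsr_step(lockup, taps) == lockup)
```

The loop visits 2^3 - 1 = 7 distinct states. Because the all-ones state reproduces itself, a counter that starts from all-zeros never enters it.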

The table below describes the
outputs that must drive the inputs
of the XNOR. A multi-input XNOR
is also known as an even-parity
circuit.

Note that the connections
described in this table are not
necessarily unique. Due to the
symmetry of the shift register
operation and the XNOR function,
other connections may also result
in maximum length sequences.
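
One way to see that the connections are not unique is to try an alternative and count the sequence length by brute force. The sketch below is an assumption-laden illustration, not taken from the application notes: the helper names xnor_lfsr_period and mirrored are invented here, and the "mirrored" connection (keep Qn, replace every other tap t by n - t) is simply one candidate alternative chosen for the demonstration.

```python
def xnor_lfsr_period(n, taps):
    """Clocks needed for an n-bit XNOR-feedback register to return to all zeros."""
    if n not in taps:
        raise ValueError("taps must include the last output Qn")
    mask = (1 << n) - 1
    state, clocks = 0, 0                 # bit t-1 of state holds output Qt
    while True:
        ones = sum((state >> (t - 1)) & 1 for t in taps)
        d1 = 1 - ones % 2                # XNOR: 1 when the taps have even parity
        state = ((state << 1) | d1) & mask
        clocks += 1
        if state == 0:
            return clocks

def mirrored(n, taps):
    """Keep Qn and replace every other tap t by n - t (assumed alternative)."""
    return tuple(sorted({n} | {n - t for t in taps if t != n}, reverse=True))

for n, taps in [(10, (10, 7)), (8, (8, 6, 5, 4))]:
    alt = mirrored(n, taps)
    print(n, taps, xnor_lfsr_period(n, taps), alt, xnor_lfsr_period(n, alt))
```

For both rows tried here, the table entry and its mirror image give the same maximum length of 2^n - 1.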











10-Bit Shift Register 
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n XNOR Feedback from Outputs 
3 3,2 
4 4,3
5 5,3 
6 6,5 
7 7,6
8 8,6,5,4 
9 9,5 
10 10,7 
11 11,9
12 12,6,4,1 
13 13,4,3,1 
14 14,5,3,1 
15 15,14 
16 16,15,13,4 
17 17,14 
18 18,11 
19 19,6,2,1 
20 20,17 
21 21,19 
22 22,21 
23 23,18 
24 24,23,22,17 
25 25,22 
26 26,6,2,1 
27 27,5,2,1 
28 28,25 
29 29,27 
30 30,6,4,1
31 31,28 
32 32,22,2,1 
33 33,20 
34 34,27,2,1 
35 35,33 
36 36,25 
37 37,5,4,3,2,1 
38 
39 39,35 
40 40,5,4,3 
Examples 


A 10-bit shift register counts 
modulo 1023, if the input D1 is 
driven by the XNOR of Q10 and 
the bit three positions to the left 
(Q7), i.e. a one is shifted into D1 
when Q10 and Q7 have even par- 
ity, which means they are identical. 

An 8-bit shift register counts
modulo 255 if the input D1 is driven
by the XNOR of Q8, Q6, Q5, Q4,
i.e., a one is shifted into D1 if these
four outputs have even parity (four
zeros, or two ones, or four ones).
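
Both examples can be checked with a short simulation. The following is a minimal sketch under stated assumptions: the function name count_modulo and the tuple model of the register are illustrative choices, not part of any Xilinx tool. It counts the distinct states each counter passes through from the all-zeros state, then lists the numbers of ones that count as even parity for four outputs.

```python
from itertools import product

def count_modulo(n, taps):
    """Number of distinct states an n-bit XNOR-feedback counter cycles through."""
    state = (0,) * n                     # (Q1, ..., Qn), all-zeros at power-up
    seen = set()
    while state not in seen:
        seen.add(state)
        d1 = 1 - sum(state[t - 1] for t in taps) % 2   # one on even parity
        state = (d1,) + state[:-1]       # shift D1 into Q1
    return len(seen)

print("10-bit, XNOR of Q10 and Q7:", count_modulo(10, (10, 7)))
print("8-bit, XNOR of Q8, Q6, Q5, Q4:", count_modulo(8, (8, 6, 5, 4)))

# Even parity of four outputs: which counts of ones shift a one into D1?
even_counts = sorted({sum(bits) for bits in product((0, 1), repeat=4)
                      if sum(bits) % 2 == 0})
print("ones among four taps giving D1 = 1:", even_counts)
```

The two counts come out as 1023 and 255, and the even-parity cases are zero, two, or four ones, matching the text.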
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. 
XC4010 Pinout Update for PQ208 
XC4010 Pinouts 
Pin 

PGi91 | Pa208 Description _| PG191 | Pa208 
vO Ki6 80 VO (D3) 19 192 
vo Ki7 8 vo (RS) ug | 133 
vO Kis 82 vO vo_| 134 
vo Lie 83 vo ve | 135 
vo a7 84 vo us| 136 
v0 U6 85 vo 18 137 
vO Mis 86 VO (02) Ww 198 
VO (A10) Gi 190 vo MI7 87 vo u7_|_ 139 
VO(A11) G2 191 v0 Nie 88 vo v6 140 
vo ial 192 vo. P18 89 vo Us 141 
vO SI 193 vo GND Mi6 90 GND 7 142 
GND G3 194 vo vo Ni7 9 vo v5 143 
vo F2 195 vo vO Ris 92 vo v4 144 
vo oO 196 vo vo T18 93 vo US 145, 
vo C1 197 vo vo PI7 94 vo 16 146 
vo E2 198 vo vo NIG 95 vo (01) v3 147 
VO (A12) F3 199 vo vo 17 96 RCLK-BUSY/RDY |_V2 148 
VO (A13) D2 200 vo vo RI7 97 vo ua | 149 
vO Bt 201 vo vo P16 98 vo 15 150 
vo E3 202 SGCK2 (vO) vo ure 99 VO (D0, DIN) U3 151 
VO (Ata) c2 203 Mi ‘SGCK3 (VO) T16 100 SGCK4(DOUT, V0) | T4 152 
ISGCK1 (A15,VO)|_B2 204 GND GND Rie 101 CCLK vi 153 
vec 03 205 ‘MO DONE U7 103 vec Ra 154 
GND D4 2 vec vec RIS 106 TDO U2 159 
PGCK1(A16, VO)[ C3 4 m2 PROG vie 108 GND R3 160 
VO (A17) C4 5 PGCK2 (VO) vO (07) T15 109 vO (AO, WS) 13 161 
vO B3 6 || vo(HDe) PGCK3 (VO) | Ui6 110 PGCK4(A1, V0) | U1 162 
vo cs 7 II vo Ti4 mM vo P3 163 
vo (TDI) A2 8 vo Ui 112 vo R2 | 164 
VO (TCK) Ba 9 vo vi7 113 VO (CS1, A2) 2 165 
vo C6 10 VO (LOC) Vié 114 VO (A3) NB | 166 
vo A3 " vo 13 115 vo P2 167 
vo BS 12 vo Ui4 116 vo il 168 
vO Be 13 vo Vis 117 vo Ri 169 
GND C7 14 vo via 118 vo N2_ | 170 
vo Aa 1s || GND T12 119 GND M3 171 
vo AS 16 || vo U13. 120 vo PY 172 
vO (TMS) 87 7 vo via 121 vo Ni 173 
vo AB 18 vo ui2 122 10 (Aa) M2 | 174 
vo ce 19 vo viz 123 VO (AS) Mi 175 
vo A 20 vo TH 124 vo iE) 176 
vo BB 24 vo Unt 125 vO 2 7 
vo Ae 22 vo vit 126 vo u 178 
vO BO 23 vo vio 127 vo Ki 179 
vo cs 24 vo Ui0 128 VO (Aé) K2 [180 
GND De 25 vO (ERR, INIT) TIO 128 UO (A7) K3 181 
vec DIO 26 vec Rio 130 GND Ka 182 

vo 27 GND Ro 131 














PC68 Pinout Discrepancy 


Page 2-33 of our 1992 Data Book lists pinouts for the 68- and 84-pin packages for the XC3020, XC3030, and
XC3042. Some designers anticipate a migration of their design from the PC84 to the smaller PC68 package, and
they carefully use only those PC84 I/O pins that are also available in the PC68. Unfortunately, this PC84-to-PC68
relationship differs between the XC3020 and the XC3030. (It differs on pins 12, 13, 14, 15, 16, 21, 22, 31, 32, 33, 40, and
41.) Our 1991 Data Book describes it correctly only for the XC3020, while our 1992 Data Book describes it
correctly only for the XC3030.





















































































































































The table below gives the composite description. We apologize for any confusion caused by this 
documentation error. 
XC3000 Family 68-Pin PLCC, 84-Pin PLCC and PGA Pinouts 
68 PLCC (XC3030 | XC3020) | XC3020, XC3030, XC3042 | 84 PLCC | 84 PGA || 68 PLCC | XC3020, XC3030, XC3042 | 84 PLCC | 84 PGA
10 70 _PWRON | ‘12 B2 44 RESET 54 Ki0 
W W TCLKIN-VO- 13 c2 45 DONE-PG 3 | 55 J10 
12 feend vor 14 Bi 46 07-v0 56 Kit 
13 12 vo 15 C1 47 XTL1(OUT)-BCLKIN-VO 57 an 
14 13 vo 16 D2 48 06-V0 58 H10 
= = vo 7 Py = vo - 59 Hit 
1 | 14 vo 18 E3 49 050 60 F10 
16 15 vo [19 2 50 TS0-vO [6 | Gio | 
= 16 an) 20 et 51 D4-vO 62 Git 
7 7 vo at F2 = vo 63 Go 
18 18 vec 22 Fa | 52 vec | es | Fo 
19 19 vo 23 G3 53 03-10 65 Fit 
= = vo 24 Gi 54 C5i-vo 66 Eu 
| 20 20 vo 25 G2 55 02.0 67 E10 
— 21 vo 26 Fr _ vo 68 E9 
24 22 vo [27 Hi = vo" - 69 On 
22 = vo 28 H2 56 Di-vO 70 D10 
23 23 vo 29 Jt 57 RDY/BUSY-RCLK-vO at cn 
24 24 vo 30 Ki 58 ___D0-DIN-VO _72 Bit 
25 25 M1-RDATA 3 J2 59 DOUT-VO. 73 C10 
26 26 MO-RTRIG 32 ui 60 CCLK 74 7 AN 4 
27 27 M210 33 K2 6 AO-WS-VO 5 B10 
28 28 HDC-10 34 K3 62 A1-CS2-V0 7 | Bo | 
29 29 vo 35 r] 63 2-00 [7 AiO 
30 30 TOC-v0 36 3 64 A3-10 7a | AG 
= cl vo 37 Ka = vor 79 Bs 
= vo: 38 la = vo" 80 Aa 
3 32 vo 39 JS 65 A15-VO 81 BE 
32 33 vo 40 Ks 66 ALO 82 87 
33 = vor 41 Ls 67 A14-VO | 83 Av 
34 34 INIT-vo 42. | ke 68 AS-VO 84 c7 
35 35 GND 43 Je 1 GND = 1 ce 
36 36 vo 44 J 2 A13-VO 2 AG 
37 37 vo 45 7 3 A6-VO 3 AS 
38 38 vo 46 «7 4 A12-V0 : 4 BS 
39 39 vo 47 Ls 5 A7-VO 5 cs 
= 40 vo 48 Ls S vor 6 AS 
— 41 vo 49 K8 - vor 7 Ba 
40 vo" 50 tg 6 A11-VO 8 AZ 
4 vor 51 Lio 7 A8-VO 9 A2 
42 42 vo 52 ko 8 A10-VO 10 B3 
43 43 XTL2(IN)-VO- 53 Lit 9 A9-VO abl AY 



































Unprogrammed IOBs have a default pull-up. This prevents an undefined pad level for unbonded or unused IOBs.
Programmed outputs are default slew-rate limited.


This table describes the pinouts of three different chips in three different packages. The second column lists 84 of the 118 pads on the XC3042 (and
the 98 pads on the XC3030) that are connected to the 84 package pins. Ten pads, indicated by an asterisk, do not exist on the XC3020, which
has 74 pads; therefore the corresponding pins on the 84-pin packages have no connections to an XC3020. Six pads on the XC3020 and 16 pads on
the XC3030, indicated by a dash (—-) in the 68 PLCC column, have no connection to the 68 PLCC, but are connected to the 84-pin packages, 
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Programmable Gate Array Training Courses 


The focus of our training de- 
partment is to provide high-qual- 
ity, comprehensive training for our 
customers on how to use our 
products, both software and ICs. 
We offer a variety of classes on 
different platforms and in various 
locations, thus providing a way 
for our customers to get up-to- 
speed and become productive as 
quickly and efficiently as possible. 

We are offering two different 
classes on our XC3000 family 
products: a 2-day class that gives a
very good overview for beginning
users, and a 4-day class that
provides a more in-depth look at
the architecture, development 
system features, and recom- 
mended design methodology. Both 
of these classes are for ALL users 
of XC2000 and XC3000 family 
products. The classes start at the 
introductory level, but quickly 
move into more detail, so that ex- 
perienced users also find these 
classes beneficial. 


We also offer a 2-day XC4000 
family class that is intended for 
designers who are already familiar
with our XC2000 or XC3000 family 
devices. It covers the XC4000 
architecture, development system 
features, and recommended 
design methodology. 

For new XC4000 customers
who are not familiar with Xilinx
FPGAs, we recommend a 4-day
XC3000/XC4000 family combination
course. This introductory class
covers all of our products and
shows how they fit together. The
background information learned 
is invaluable, since parts of some 
applications may well fit into an 
XC3000 device, possibly providing 
a more cost-effective solution. 

All of our classes include 
hands-on lab exercises giving you 
the opportunity to gain the 
experience you need to be 
productive immediately after you 
return to your office. Most of the 





XC3000 


Day 1 XACT Design Manager 
XMAKE Automatic Translation - Lab 
Basic Architecture 
Estimating Size 
Design Entry - Lab 
Day 2 Design Implementation 
XNFMAP Partitioning - Lab 
APR Placement and Routing - Lab 
MAKEBITS Bitstream Generator 
MAKEPROM PROM Formatter 
Day 3 Configuration 
Design Verification 
Simulation - Lab 
Downloading 
Day 4 EDITLCA Graphical Editor - Lab 
Architecture Details - Lab 
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Course Programs 


XC4000 


Day 1 XACT Design Manager 
XMAKE Automatic Translation - Lab 
Basic Architecture 
Estimating Size 
Design Entry - Lab 
MEMGEN RAM/ROM Compiler - Lab 
Day 2 Design Implementation 
PPR Partitioning, Placement, & Routing 
Configuration - Lab 
MAKEBITS Bitstream Generator - Lab 
MAKEPROM PROM Formatter 
Design Verification 
Downloading - Lab 
Readback 
EDITLCA Graphical Editor - Lab 





classes use PCs, but some of the 
training centers also have 
SPARCstations. Our focus is on 
the process of designing, not on 
the specific platform or schematic 
entry tools. 

We offer these classes in our 
factory in San Jose, California, and 
also in Regional Training Centers 
worldwide. Please consult the 
schedule on the next page for 
classes in your area. We can
provide any of these classes, or
custom-tailored classes, at your own
facility. If you have any questions,
please contact your local Xilinx 
sales office. 

The standard price of the
classes is $1000 per student for the
4-day classes, and $750 per student
for the 2-day classes. These are US
prices; prices vary at the
international locations.



XC3000 & XC4000 


Day 1 XACT Design Manager 
XMAKE Automatic Translation - Lab 
XC3000 Basic Architecture 
XC3000 Estimating Size 
XC3000 Design Entry - Lab 

Day 2 XC3000 Design Implementation - Lab 
MAKEBITS Bitstream Generator 
MAKEPROM PROM Formatter 
Downloading 
XC3000 Configuration 
Design Verification 

Day 3 EDITLCA Graphical Editor - Lab
XC4000 Basic Architecture 
XC4000 Estimating Size 
XC4000 Design Entry - Lab 

Day 4 XC4000 MEMGEN 
RAM/ROM Compiler - Lab 
XC4000 Design Implementation 
XC4000 Configuration - Lab 
XC4000 Downloading - Lab 
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ELXILINX 





Applications Handbook 


As a part of an on-going
commitment to customer support, Xilinx
has just published the first XAPP 
Applications Handbook. The first 
edition contains 16 application notes 
that address a wide range of topics; 
more application notes are already 
in progress. 

The handbook contains in- 
depth explanations of LCA features 
and design examples, showing how 
best to utilize the LCA device. Most 
of the design examples have been 
fully implemented, and Viewlogic* 
schematic files are available. 

These files illustrate the com- 
plete implementation process, and 
provide macros that may be used in 
other designs. Copies of these de- 
sign files can be obtained through 
the Xilinx Technical Bulletin Board, 
or by calling the Xilinx Applica- 
tions Hotline for a disk. 

Contact Xilinx for your copy of 
the XAPP Applications Handbook. 
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* Viewlogic is a trademark of Viewlogic 
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